perm filename ZZZ[1,ALS]1 blob sn#001059 filedate 1972-08-28 generic text, type T, neo UTF8
00010		Some Preliminary Experiments in Speech Recognition
00020			using Signature Table Learning
00030	
00040	       	 		   by
00050			R.B.Thosar and A.L.Samuel
00060	
00070		A limited amount of success has been achieved in the
00080		application of the signature table scheme of machine
00090		learning to the problem of automatic speech recognition.
00100		The scheme is based on the assumption that the recognition
00110		system must eventually employ a learning mechanism and that
00120		the acoustic part of the system must start by dealing
00130		with the recognition of fairly elemental speech segments
00140		rather than with words if it is to have general utility.
00150	
00160	
00170	
00180		The experiments reported here are only the early part of
00190	a long-range program to devise elements of a speech
00200	recognition  system  that  would  not  be dependent upon the use of a
00210	limited vocabulary and that could recognize continuous  speech  by  a
00220	number  of  different  speakers. The aim is that the system should be
00230	able to function successfully either without  any  previous  training
00240	for  the  specific  speaker  in  question  or  after a short learning
00250	session in which the speaker would be asked to repeat certain phrases
00260	designed to train the system on those phonetic utterances that seemed
00270	to depart from the previously learned norm.
00280	
00290		At the present time we are not attempting to build a complete
00300	operating  system.  Rather, the  attempt  is  to concentrate on those
00310	aspects of the  general  speech  recognition  problem  that  seem  to
00320	require the greatest amount of work or that are within our particular
00330	field of competence. It is hoped that our work will, in this way,
00340	supplement  the work on complete systems that is underway at a number
00350	of different locations.
00360	
00370		We are currently attempting  to  apply  the  signature  table
00380	learning  scheme,  previously  used  in another connection(1), to the
00390	problem of  phoneme  identification.  This  scheme  makes  use  of  a
00400	hierarchy  of  tables  that contain identifying information which has
00410	been derived from learning sessions, as will be described  below.  By
00420	restricting  all  of  the  speech-specific  aspects to data stored in
00430	tables, the instruction  sequence  that  processes  these  tables  is
00440	independent  of  the  nature  of the information being processed. The
00450	same routine that processes those tables  whose  inputs  are  derived
00460	from  the  acoustic  input  can  also  be used to process tables with
00470	syntactic, semantic and linguistic inputs as well. Signature tables
00480	are thus viewed as a basic tool that deserves special study.
00490	
00500		Signature  tables  can  be  used  to  perform  four essential
00010	functions that are required in the automatic recognition  of  speech.
00020	These   functions  are:   (1)  the  elimination  of  superfluous  and
00030	redundant information from the acoustic input stream, (2)
00040	the  transformation  of the remaining information from one coordinate
00050	system to a more phonetically meaningful coordinate system,  (3)  the
00060	mixing  of  acoustically  derived  data  with syntactic, semantic and
00070	linguistic information to obtain the desired recognition, and (4) the
00080	introduction  of  a  learning mechanism. Signature tables differ from
00090	the perceptron, to which they are often erroneously compared, in that
00100	all possible functions of the inputs can be represented and reported
00110	as the output, subject only to the restriction imposed by the digital
00120	nature and limited range of the permitted output. A hierarchy of
00130	tables thus provides a mechanism  for  the  systematic  reduction  in
00140	information  content  by  the elimination of extraneous and redundant
00150	information. If the hierarchy is properly  designed  one  can  expect
00160	that this reduction in total information will be obtained without any
00170	loss in the desired information. Obviously considerable care must  be
00180	exercised in the design of the tables and of the hierarchy to achieve
00190	this desirable objective. We are attempting to do this  by  designing
00200	the  signature  table  hierarchy along conventional phonetic lines so
00210	that we can take full advantage of the wealth of phonetically oriented
00220	research that has been and is being done in many places.
00230	
00240		The  signature  tables, as used in speech recognition, differ
00250	in a number  of  significant  respects  from  the  tables  previously
00260	described.
00270	
00280		A  signature  table consists of two parts, a preamble and the
00290	table proper. The preamble contains: (1) space for saving a record of
00300	the current and recent output reports from the table, (2) identifying
00310	information as to the specific type of table, (3)  a  parameter  that
00320	identifies  the desired output from the table and that is used in the
00330	learning process, (4) a gating parameter specifying the input that
00340	is to be used to gate the table, (5) the gating level to be used, and
00350	(6) parameters that identify the sources of the normal inputs to the
00360	table.
00370	
00380		All  inputs  are  limited  in  range  and  specify either the
00390	absolute level of some basic property or more usually the probability
00400	of  some  property  being  present.   These  inputs  may  be from the
00410	original acoustic input or they may be the outputs of  other  tables.
00420	If  from  other  tables  they may be for the current time step or for
00430	earlier time steps (subject to practical limits as to the number of
00440	time steps that are saved).
00450	
00460		The output, or outputs, from each table are similarly limited
00470	in  range  and  specify,  in  all  cases,  a  probability  that  some
00480	particular significant feature, phonette, phoneme, word segment, word
00490	or phrase is present.
00500	
00010		We are limiting the range of inputs  and  outputs  to  values
00020	specified by 3 bits and the number of entries per table to 64,
00030	although this choice of values is a matter to be determined by
00040	experiment. We are also providing for any of the following input
00050	combinations: (1) one input of 6 bits, (2) two inputs of 3 bits each,
00060	(3) three inputs of 2 bits each, and (4) six inputs of 1 bit each.
00070	The uses to which these different forms are put will be described
00080	later.
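
The four input combinations all total six bits, which is what lets any of them address a 64-entry table. A minimal sketch of that packing follows. The function name and the ordering of the inputs within the address (first input most significant) are assumptions made for illustration and are not taken from the report.

```python
def table_address(inputs, bits_per_input):
    """Pack equal-width inputs into one 6-bit table address (0..63).

    Assumption: the first input occupies the most significant bits.
    """
    if len(inputs) * bits_per_input != 6:
        raise ValueError("inputs must total exactly 6 bits")
    address = 0
    for value in inputs:
        if not 0 <= value < (1 << bits_per_input):
            raise ValueError("input value does not fit in its field")
        # Shift earlier inputs up and append this one in the low bits.
        address = (address << bits_per_input) | value
    return address


if __name__ == "__main__":
    print(table_address([5, 3], 3))      # two 3-bit inputs
    print(table_address([3, 0, 2], 2))   # three 2-bit inputs
```

Whatever the ordering actually used, the point is that every permitted combination of input values selects exactly one of the 64 entries.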
00090	
00100		The  body  of  each  table  contains entries corresponding to
00110	every possible combination of  the  allowed  input  parameters.  Each
00120	entry  in  the  table  actually  consists of several parts. There are
00130	fields assigned to accumulate counts of the occurrences of incidents
00140	in  which  the  specifying  input values coincided with the different
00150	desired outputs from the table  as  found  during  previous  learning
00160	sessions  and  there  are fields containing the summarized results of
00170	these learning sessions, which are used as outputs  from  the  table.
00180	The  outputs from the tables can then express to the allowed accuracy
00190	all possible functions of the input parameters.
00200	
00210		When operating in the learning mode the program  is  supplied
00220	with  a  sequence  of  stored  utterances  with accompanying phonetic
00230	transcriptions.  Each  segment  of  the  incoming  speech  signal  is
00240	analysed  (Fourier transforms or inverse filter equivalent) to obtain
00250	the necessary input parameters for the lowest level tables in the
00260	signature table hierarchy. At the same time reference is made to a
00270	table of phonetic "hints" that prescribes, for every possible
00280	phonemic input, the desired output from each table. The
00290	signature tables are then processed.
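
As an illustration of the hint table, the sketch below stores a few rows copied from the phonette list in Appendix C and answers whether a given feature is the desired output for the phonette named in the transcription. The dictionary layout and the function name are hypothetical. Only the phonette and feature mnemonics come from the report.

```python
# A few rows from the phonette list of Appendix C; the full list is longer.
HINTS = {
    "EE": {"VOICED", "VOWEL", "FRONT"},
    "S": {"FRIC", "FRIC2"},
    "M": {"VOICED", "NASAL", "NAS1", "NASGLI"},
    "SI": {"STOP"},
}


def desired_output(phonette, feature):
    """True if the hint table marks `feature` as present for `phonette`.

    Phonettes missing from this partial table are treated as having no
    features, which is an assumption of the sketch.
    """
    return feature in HINTS.get(phonette, set())


if __name__ == "__main__":
    print(desired_output("EE", "VOICED"))
    print(desired_output("S", "VOICED"))
```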
00300	
00310		The processing of each  table  is  done  in  two  steps,  one
00320	process  at each entry to the table and the second only periodically.
00330	The first process consists of locating a single entry line within the
00340	table  as  specified by the inputs to the table and adding a 1 to the
00350	appropriate field to indicate the presence of the property  specified
00360	by the hint table as corresponding to the phoneme specified in the
00370	phonemic transcription. At this time a report is also made as to  the
00380	table's  output  as  determined from the averaged results of previous
00390	learning so that a running record may be kept of the  performance  of
00400	the   system.  At  periodic  intervals  all  tables  are  updated  to
00410	incorporate recent learning results.  To  make  this  process  easily
00420	understandable,  let  us  restrict  our  attention to a table used to
00430	identify a single significant feature, say Voicing. The hint table
00440	will identify whether or not the phoneme currently being processed is
00450	to be considered voiced. If it is voiced, a 1 is added to  the  "yes"
00460	field of the entry line located by the normal inputs to the table. If
00470	it is not voiced, a 1 is added to the "no" field.  At  updating  time
00480	the  output that this entry will subsequently report is determined by
00490	dividing the accumulated sum in the "yes" field by  the  sum  of  the
00500	numbers in the "yes" and the "no" fields, and reporting this quantity
00010	as a number in the range from 0 to 7. Actually the process is  a  bit
00020	more complicated than this and it varies with the exact type of table
00030	under consideration, as reported in detail in Appendix B. Outputs
00040	from the signature tables are not probabilities, in the strict sense,
00050	but  are  the  statistically-arrived-at  odds  based  on  the  actual
00060	learning sequence.
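
The simplified Voicing example can be sketched as follows. The class name is hypothetical, and the scaling of the yes/(yes+no) ratio to the range 0 to 7 by rounding to the nearest integer is an assumption. The text notes that the actual process is somewhat more complicated.

```python
class FeatureEntry:
    """One entry line of a single-feature table (simplified)."""

    def __init__(self):
        self.yes = 0      # times the hint said the feature was present
        self.no = 0       # times the hint said it was absent
        self.output = 0   # value reported until the next update

    def learn(self, feature_present):
        # First process: done at each entry to the table.
        if feature_present:
            self.yes += 1
        else:
            self.no += 1

    def update(self):
        # Second process: done only periodically.
        total = self.yes + self.no
        if total:
            # Integer form of round(7 * yes / total), halves rounded up.
            self.output = (14 * self.yes + total) // (2 * total)
        return self.output


if __name__ == "__main__":
    entry = FeatureEntry()
    for voiced in (True, True, True, False):
        entry.learn(voiced)
    print(entry.update())
```

Because the output changes only in update(), the value reported during learning reflects previous learning, as the text describes.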
00070	
00080		The preamble of the table has space for storing twelve past
00090	outputs. An input to a table can be delayed to that extent. Such a table
00100	relates outcomes of previous events to the present hint, that is, to the
00110	learning input. A certain amount of context-dependent learning is thus
00120	possible, with the limitation that the specified delays are constant.
00130	
00140		The interconnected hierarchy of tables forms a network which
00150	runs incrementally, in steps synchronous with the time window over which
00160	the input signal is analysed. The present window width is set at 12.8
00170	ms (256 points at 20 K samples/sec) with an overlap of 6.4 ms. Inputs
00180	to this network are the parameters abstracted from the frequency
00190	analyses of the signal, and the specified hint. The outputs of the
00200	network could be either the probability attached to every phonetic
00210	symbol or the output of a table associated with a feature such as
00220	voiced, vowel, etc. The point to be made is that the output generated
00230	for a segment is essentially independent of its contiguous
00240	segments. The dependency achieved by using delays in the inputs is
00250	invisible to the outputs. The outputs thus report the best estimate of
00260	what the current acoustic input is, with no relation to the past
00270	outputs. Relating the successive outputs along the time dimension is
00280	realised by counters.
00290	
00300		A counter provides a mechanism for indicating when an output
00310	reaches a significant level and the period for which it remains high. A
00320	counter is triggered when its input crosses a specified
00330	threshold. Momentary spikes are eliminated by specifying a time
00340	hysteresis, the number of consecutive segments for which the input
00350	must be above the threshold. The output of a counter provides
00360	information about the starting time, duration and average input for the
00370	period it was active.
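
A counter of this kind might be sketched as below. The function name, the strict "greater than" comparison, and the choice to discard any run shorter than the hysteresis are assumptions of the sketch. The report gives only the behaviour in outline.

```python
def run_counter(inputs, threshold, hysteresis):
    """Return (start, duration, average_input) for each active period.

    A period is a run of consecutive segments whose input is above
    `threshold`; runs shorter than `hysteresis` segments are treated
    as momentary spikes and dropped.
    """
    values = list(inputs)
    periods = []
    start = None
    # One extra step past the end closes a run still open at the last segment.
    for t in range(len(values) + 1):
        above = t < len(values) and values[t] > threshold
        if above and start is None:
            start = t
        elif not above and start is not None:
            duration = t - start
            if duration >= hysteresis:
                average = sum(values[start:t]) / duration
                periods.append((start, duration, average))
            start = None
    return periods


if __name__ == "__main__":
    table_output = [0, 6, 0, 5, 6, 7, 6, 1, 0]
    print(run_counter(table_output, threshold=4, hysteresis=2))
```

In the example the single high value at segment 1 is rejected as a spike, and only the four-segment run is reported.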
00380	
00390		Since a counter can reference a table at  any  level  in  the
00400	hierarchy of tables, it can reflect any desired degree of information
00410	reduction. For example, a counter may be set up to show a section  of
00420	speech to be a vowel, a front vowel or the vowel /I/. The counters can
00430	be looked upon as representing a mapping of parameter-time space into a
00440	feature-time space, or at a higher level, symbol-time space. It may be
00450	useful to carry along the feature information as a backup in those
00460	situations  where  the  symbolic  information  is  not  acceptable to
00470	syntactic or semantic interpretation.
00480	
00490		In the same manner as the tables, the counters run completely
00500	independently of each other. In a recognition run the counters may
00010	overlap in arbitrary fashion, may leave gaps where no counter has
00020	been triggered, or may not line up nicely. A properly segmented output,
00030	where the consecutive sections are in time sequence and are neatly
00040	labeled, is essential for further processing. This is achieved by
00050	registering the instants when the counters are triggered or
00060	terminated to form time segments called events.
00070	
00080		An event is the period between successive activations or
00090	terminations of any counter. An event shorter than a specified time is
00100	merely ignored. A record of event durations and up to three active
00110	counters, ordered according to their probability, is maintained.
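
The formation of events from counter periods can be sketched as follows. The tuple layout, the function name, and the use of each counter's average input as its "probability" for ordering are assumptions made for illustration.

```python
def build_events(periods, min_duration):
    """Cut the time axis at every counter start or end and list the events.

    periods: list of (label, start, end, average) with `end` exclusive.
    Returns a list of (start, duration, labels), where labels holds up to
    three active counters, highest average first.
    """
    boundaries = sorted({t for _, s, e, _ in periods for t in (s, e)})
    events = []
    for begin, end in zip(boundaries, boundaries[1:]):
        if end - begin < min_duration:
            continue  # events shorter than the specified time are ignored
        active = [(avg, label) for label, s, e, avg in periods
                  if s <= begin and end <= e]
        active.sort(reverse=True)
        events.append((begin, end - begin, [label for _, label in active[:3]]))
    return events


if __name__ == "__main__":
    counters = [
        ("VOWEL", 0, 10, 6.0),
        ("FRONT", 2, 10, 5.0),
        ("EE", 3, 9, 4.0),
    ]
    for event in build_events(counters, min_duration=2):
        print(event)
```

An interval in which no counter is active still appears as an event with an empty label list, so gaps in the counter output remain visible to later stages.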
00120	
00130		An event resulting from the processing described so far
00140	represents a phone or a phonette, the basic speech categories defined
00150	as hints in the learning process. It is only an estimate of closeness
00160	to a speech category, based on past learning. Also, each category has
00170	a more-or-less stationary spectral characterisation. Thus a category
00180	may have a phonemic equivalent, as in the case of vowels, may be
00190	common to a phoneme class, as for the voiced or unvoiced stop gaps, or
00200	may be subphonemic, as a T-burst or a K-burst. The categories should be and are
00210	based on acoustic expediency, i.e. optimisation of the learning,
00220	rather than any linguistic considerations. However, higher level
00230	interpretive programs may best operate on inputs resembling a phonemic
00240	transcription. The contiguous events may be coalesced into phoneme-like
00250	units using dyadic or triadic probabilities and acoustic-phonetic
00260	rules particular to the system. For example, a period of silence
00270	followed by a type of burst or a short friction may be combined to
00280	form the corresponding stop. A short friction or a burst following a
00290	nasal or a lateral may be called a stop even if the silence period is
00300	short or absent. Clearly these rules must be specific to the system,
00310	based on the confidence with which durations and phonette categories
00320	are recognised.
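
The first coalescing rule (silence followed by a burst forms the corresponding stop) might look like the sketch below. It uses the phonette mnemonics SI, PB, TB and KB from Appendix C. Reading SI as the silent stop gap and mapping the bursts to P, T and K are assumptions of this sketch, and the probabilistic weighting mentioned in the text is left out.

```python
# Assumed mapping from burst phonettes to the stops they signal.
BURST_TO_STOP = {"PB": "P", "TB": "T", "KB": "K"}


def coalesce_stops(labels):
    """Merge each 'SI' followed by a burst into a single stop symbol."""
    merged = []
    i = 0
    while i < len(labels):
        is_gap = labels[i] == "SI"
        has_burst = i + 1 < len(labels) and labels[i + 1] in BURST_TO_STOP
        if is_gap and has_burst:
            merged.append(BURST_TO_STOP[labels[i + 1]])
            i += 2  # consume both the gap and the burst
        else:
            merged.append(labels[i])
            i += 1
    return merged


if __name__ == "__main__":
    print(coalesce_stops(["S", "SI", "TB", "EE"]))
```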
00330	
00340		How far can such a bottom-up approach be pushed? In the absence
00350	of a higher interpretive program the first order estimate generated by
00360	the acoustic processor cannot be improved upon. The system, however,
00370	does provide several levels of backup if its output is not acceptable
00380	to the interpreter. The lower order events and the feature space are
00390	available for immediate queries. If the interpreter has a high
00400	expectation for a class, say nasals, for a segment of speech, a rerun
00410	can force a choice with suitable modification of gating thresholds
00420	for selected tables and counters. Thus no basic modification in the
00430	system seems necessary to provide a backtrack capability, probably
00440	an important requirement in a complete speech recognition system.
00450	
00460		The foregoing, rather long introduction was intended to convey
00470	the methodology we have adopted in tackling the speech recognition
00480	problem. So far the system has all the machinery required for
00490	learning and generating event sequences for arbitrary speech inputs,
00500	and for the evaluation of the learning and recognition processes. The
00010	phonette categories and the table network are constructed along the
00020	traditional phonetic approach. The final setup may well reflect a
00030	categorisation which produces a statistically optimal result.
00040	Operational  details  of  the  current  system  and  some preliminary
00050	results are included in the following appendices.
00060	
00070		Appendix A describes routines for parameter  extraction  from
00080	the Fourier and linear prediction analysis.
00090	
00100		Appendix B describes the signature tables.
00110	
00120		Appendix  C  describes  the present table network and some of
00130	the results.
00140	
00010				APPENDIX A
00020	
00030			Extraction of Speech Parameters
00040	
00050	
00060		The acoustic signal is analysed in regular 12.8 ms steps
00070	with an overlap of 6.4 ms. Each segment is multiplied by the Hamming
00080	window and a standard FFT or a linear prediction algorithm is applied
00090	to obtain a log-magnitude spectrum. The FFT routine is a faster MACRO
00100	version of a FORTRAN program developed at the University of Utah. The
00110	linear prediction scheme uses Markel's code.
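
The framing arithmetic and the windowed log-magnitude spectrum can be sketched as follows. The 256-point window at 20,000 samples/sec (12.8 ms) and the 6.4 ms overlap (128 samples) come from the text. Everything else is an assumption: the function names are hypothetical, and a plain discrete Fourier transform stands in for the FFT routine, which it matches in result but not in speed.

```python
import cmath
import math

SAMPLE_RATE = 20000   # samples per second
FRAME_LEN = 256       # 256 / 20000 s = 12.8 ms
HOP = 128             # successive frames overlap by 6.4 ms


def frames(signal, frame_len=FRAME_LEN, hop=HOP):
    """Split the signal into overlapping frames, dropping a short tail."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def hamming(n):
    """Hamming window coefficients for an n-point frame."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1))
            for k in range(n)]


def log_magnitude(frame):
    """Log-magnitude spectrum in dB for bins 0..n/2 (plain DFT, not an FFT)."""
    n = len(frame)
    windowed = [s * w for s, w in zip(frame, hamming(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(20 * math.log10(abs(acc) + 1e-12))
    return spectrum


if __name__ == "__main__":
    tone = [math.sin(2 * math.pi * 1250 * t / SAMPLE_RATE) for t in range(512)]
    spec = log_magnitude(frames(tone)[0])
    peak_bin = spec.index(max(spec))
    print(peak_bin, peak_bin * SAMPLE_RATE / FRAME_LEN)
```

Each bin is 20000/256 = 78.125 Hz wide, so a 1250 Hz test tone falls exactly on bin 16.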
00120	
00130		A uniform procedure is applied to the spectrum  to  derive  a
00140	set  of  19 parameters. These include the first three formants, nasal
00150	and fricative poles and zeros, the amplitudes at the corresponding
00160	points and average energies in selected frequency regions. Most of the
00170	poles and zeros are located as peaks and minima in a specified
00180	frequency range. A slightly more complicated procedure is used to
00190	determine the three formants for which  the  ranges  are  allowed  to
00200	overlap.
00210	
00220		The  ambiguous situation where the second formant is confused
00230	with either the first or the third  formant  is  resolved  using  two
00240	conditions.  If another "prominent" peak is found in the range, it is
00250	labeled as the second formant. Otherwise the spectral balance, given by
00260	the ratio of energies in the F1 and F3 regions, is used to place the second
00270	formant close to F1 or F3.
00275	
00280		The  average  energy is obtained by averaging over the entire
00290	frequency range. In order to avoid sharp changes in the low and  high
00300	frequency energies, the magnitude is linearly reduced from the
00310	"break-point" to the "cutoff-point".
00320		The nomenclature and the parameter ranges are given in Table 1.
00325	
00330		The linear prediction scheme uses identical parameter  ranges
00340	and almost the same procedures. The iterative peak locating procedure
00350	is somewhat simplified since a much cleaner spectrum, without any
00360	local irregularity, is obtained from linear prediction.
00370	
00010		Parameter		Lower Limit	Upper Limit (Hz)
00020	
00030		F1: first formant	200		800
00040		F2: second formant	700		2050
00050		F3: third formant	2000		3200
00060		A1: F1 amplitude
00070		A2: F2 amplitude
00080		A3: F3 amplitude
00090		FP1: fricative pole 1	1800		3200
00100		FP2: fricative pole 2	3200		5000
00110		FP1A: FP1 amplitude
00120		FP2A: FP2 amplitude
00130		FZ: fricative zero	FP1		FP2
00140		FZA: FZ amplitude
00150		NP: nasal pole		800		1500
00160		NZ: nasal zero		NP		NP+500
00170		NPA: NP amplitude
00180		NZA: NZ amplitude
00190		LPE: low region energy	0		450
00200		HPE: high region energy	2500		10000
00210		AVE: average energy	0		10000
00220	
00230	
00240		Table 1. Input Parameters and Their Ranges
00250	
00010				APPENDIX B
00020	
00030			     Signature Tables
00040	
00050	
00060		The signature tables are basically of three types: (1) Input
00070	tables are designed to compress a parameter range and make it
00080	compatible with the rest of the system. (2) Intermediate P-tables
00090	learn on a single feature and have a single output. (3) Output
00100	Q-tables learn on a set of phonetically similar sounds, up to four per
00110	table, and have the corresponding outputs and a fifth null output.
00120	
00130		As mentioned earlier, each table is 64 words long with a ten
00140	word preamble, except that a Q table is twice as long to provide
00150	space for the five fields.
00160	
00170		Every line in an input table has two fields: one contains the
00180	count of the number of times that particular input occurred and the
00190	other contains the output value associated with that count. The
00200	parameter range is adjusted to be 0:63 so that its value can be used
00210	directly to locate the proper table address. The computation of the
00220	line outputs is done so as to maximise the information content in the
00230	output, i.e. every output value has equal probability of occurrence.
00240	This is realised by dividing up the table into eight sections such that
00250	the sum of the counts in each section is almost 1/8th of the total count.
00260	This updating is done after eight learning inputs in the early stages and
00270	less frequently afterwards.
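
One way to realise the equal-count division is sketched below. The rule used here (a line's output is set by the share of the total count lying below the line's midpoint) is an assumption chosen for simplicity. The report does not give the exact procedure, and the function name is hypothetical.

```python
def assign_outputs(counts, levels=8):
    """Give each table line an output 0..levels-1 so that each output
    value covers roughly an equal share of the observed counts."""
    total = sum(counts)
    outputs = []
    running = 0
    for count in counts:
        midpoint = running + count / 2   # cumulative count at line centre
        level = int(levels * midpoint / total) if total else 0
        outputs.append(min(level, levels - 1))
        running += count
    return outputs


if __name__ == "__main__":
    # Inputs seen only in the lowest eight lines: those lines are spread
    # over all eight output values instead of sharing output 0.
    skewed = [8] * 8 + [0] * 56
    print(assign_outputs(skewed)[:10])
```

With uniform counts the 64 lines fall into eight equal sections of eight lines each. With skewed counts the sections narrow where the inputs are dense, which is the compression the input tables are meant to provide.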
00280	
00290		The P tables can have two 3-bit inputs, three 2-bit inputs or
00300	six 1-bit inputs. The least significant bits of the 3-bit table
00310	output usually generated are ignored wherever necessary. The joint
00320	six bits thus provide the correct address within the table. The
00330	gating input to a table is obtained from another table. The gating
00340	level can be a positive or a negative number whose magnitude lies between 0 and 7. A
00350	positive threshold turns a gate on if the gating input is above the
00360	threshold. The converse is true for a negative threshold. Every line in a
00370	table has two fields which accumulate counts and a field for the
00380	output. A 1 is added to the good count (G) field if the learning input is
00390	on; otherwise a 1 is added to the bad count (B) field. The
00400	corresponding outputs are computed by normalising the ratios G/(G+B)
00410	into the 0 to 7 range.
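
The gating rule can be sketched as a single comparison. The function name is hypothetical, and two details are assumptions: that "above" means strictly greater, and that a negative level of magnitude m turns the gate on when the gating input is strictly below m.

```python
def gate_open(gating_input, gating_level):
    """Decide whether a table's gate is on.

    gating_input: 3-bit output (0..7) of the gating table.
    gating_level: signed threshold; the sign selects the direction.
    """
    if gating_level >= 0:
        # Positive threshold: on when the gating input is above it.
        return gating_input > gating_level
    # Negative threshold: the converse, on when the input is below it.
    return gating_input < -gating_level


if __name__ == "__main__":
    print(gate_open(5, 4), gate_open(2, -4))
```

A signed integer cannot distinguish +0 from -0, so a stored representation with a separate sign bit would need slightly different handling than this sketch.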
00420	
00430		The input and gating configuration for the Q tables is
00440	identical to that for the P tables. Each line of the Q table has five count
00450	fields, four for the learning inputs and the fifth for "not-any".
00460	Only the positive counts are kept. If none of the learning inputs is
00470	present the not-any field is incremented. The output corresponding to
00480	a hint is computed on the basis of the counts in the other fields in the
00490	same line.
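
A Q-table line with its five count fields can be sketched as follows. The class name is hypothetical, and the output rule (each hint's share of all counts in the line, scaled to 0 to 7) is an assumption. The report says only that the output depends on the counts in the other fields of the same line.

```python
class QLine:
    """One line of a Q table: up to four hint counts plus 'not-any'."""

    def __init__(self, hints):
        if len(hints) > 4:
            raise ValueError("a Q table learns on at most four sounds")
        self.counts = {hint: 0 for hint in hints}
        self.not_any = 0

    def learn(self, hint):
        # Only positive counts are kept: one field is incremented per step.
        if hint in self.counts:
            self.counts[hint] += 1
        else:
            self.not_any += 1

    def output(self, hint):
        total = sum(self.counts.values()) + self.not_any
        if total == 0:
            return 0
        # Assumed rule: round(7 * count / total), halves rounded up.
        return (14 * self.counts[hint] + total) // (2 * total)


if __name__ == "__main__":
    line = QLine(["EE", "AE", "E", "I"])
    for hint in ["EE"] * 6 + ["I", "S"]:
        line.learn(hint)
    print({h: line.output(h) for h in line.counts}, line.not_any)
```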
00500	
00020		The learning network is created by a program called MAKE. The
00030	tables are assigned consecutive 74 word blocks in the space allocated
00040	for the tables. The first  word  in  the  preamble  is  reserved  for
00050	current  and past outputs of the table. An input to the table is thus
00060	merely a byte pointer pointing to the correct table. The byte pointer
00070	is itself stored in the body of the preamble. The code running the
00080	tables is thus greatly simplified.
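
Twelve outputs of 3 bits each occupy 36 bits, so the output history could be held in a single word and read with a shift and mask, which is the effect of a byte pointer. The sketch below assumes a 36-bit word and assumes that the newest output sits in the low-order field. Neither detail is stated in the report, and the function names are hypothetical.

```python
WORD_BITS = 36    # assumed machine word size
FIELD_BITS = 3    # each table output is a 3-bit value
SLOTS = WORD_BITS // FIELD_BITS
WORD_MASK = (1 << WORD_BITS) - 1
FIELD_MASK = (1 << FIELD_BITS) - 1


def push_output(history, value):
    """Shift the history up one field and store the newest output."""
    return ((history << FIELD_BITS) | (value & FIELD_MASK)) & WORD_MASK


def delayed_output(history, delay):
    """Read the output from `delay` time steps ago (0 = current)."""
    if not 0 <= delay < SLOTS:
        raise ValueError("delay exceeds the stored history")
    return (history >> (FIELD_BITS * delay)) & FIELD_MASK


if __name__ == "__main__":
    word = 0
    for out in (1, 2, 3):
        word = push_output(word, out)
    print([delayed_output(word, d) for d in range(3)])
```

Under this layout a delayed input, such as those shown with a nonzero delay in the table listing of Appendix C, differs from an undelayed one only in the shift amount.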
00090	
00100		Execution of the tables is identical in learning as  well  as
00110	recognition  modes except that no counts are added in the recognition
00120	mode. This simplifies the code even further.
00130	
00020				APPENDIX C
00030	
00040			      Current System
00050	
00060	
00070		The current system is probably best described with respect to
00080	the  documentation  generated  by the program MAKE when the system is
00090	created. The documentation is given as Tables 1-4 in this appendix.
00100	
00110		The number  of  acoustic  input  parameters,  their  mnemonic
00120	representation and the order in which they are supplied is determined
00130	by the parameter extraction routine. MAKE generates  an  input  table
00140	for  each  parameter.  Henceforth  the  table  is  referenced  by its
00150	mnemonic tag. A list of current tables appears in Table 1.
00160	
00170		The system allows up to 36 significant features to be
00180	specified. The present set of features is given in Table 2. The list
00190	of sound categories with their associated significant features is
00200	also in Table 2. Most of the features retain their conventional
00210	significance. Rather artificial features like VOC1 and VOC2 were
00220	introduced to separate members of a subclass such as front vowels.
00230	
00240		The  features  have  been chosen so as to isolate a subset of
00250	phonettes which can be consistently identified.  Features  which  may
00260	define a large subset have been avoided for two reasons. First, such a feature may
00270	tend to smear a table, reducing its effectiveness. Second, the subset
00280	defined  by negation of a feature, say ¬VOWEL, appears to be a weaker
00290	decision by a table.
00300	
00310		The signature tables currently in  use  are  given  in  Table
00320	3. Dummy tables are inserted between two levels of the tables for
00330	later additions. The number attached to the table type gives the
00340	number  of  inputs.  The number adjacent to an input shows the delay.
00350	The first value attached to a gating input is the threshold.
00360	
00370		The set of counters that are being used is given in Table 4.
00380	
00390	
00400		The  data being used for experimentation with the system is a
00410	set of 54 words recorded by Dr. Ken Stevens.  These  recordings  have
00420	been  used  earlier  by Gold, Bobrow and Klatt, Vicens, and Erman and
00430	Reddy. Only a limited amount  of  learning  has  been  done,  and  no
00440	attempt has been made to modify the set of phonettes or the tables to
00450	improve performance.
00460	
00470	
00480	
00490	RESULTS-------
00500	
     

00010		SIGNATURE TABLE SET-UP AS OF 24-JUL-1972   1421:49
00020	
00030	The following input tables exist
00040	
00050	F1    	F2    	F3    	A1    	A2    	A3    	FP1   	FP1A  	
00060	FP2   	FP2A  	FZ    	FZA   	NP    	NPA   	NZ    	NZA   	
00070	LPE   	AVE   	HPE   	
00080	
00100		Table 1. List of Input Parameters.
00110	
00120	
00140	Available SIGNIFICANT FEATURES are
00150	
00160	VOICED	FRIC  	VOWEL 	GLIDE 	NASAL 	STOP  	BURST 	FRONT 	
00170	MID   	BACK  	VOC1  	VOC2  	GLI1  	GLI2  	NAS1  	NAS2  	
00180	VF1   	VF2   	FRIC1 	FRIC2 	FRIC3 	FRICX 	NASGLI	VOIFRI	
00190	
00210	PH list and H list table contains
00220	
00230	PH	Significant features
00240	NU    	
00250	EE    	VOICED VOWEL  FRONT  
00260	AE    	VOICED VOWEL  FRONT  VOC1   
00270	E     	VOICED VOWEL  FRONT  VOC2   
00280	I     	VOICED VOWEL  FRONT  VOC1   VOC2   
00290	AS    	VOICED VOWEL  MID           
00300	AA    	VOICED VOWEL  MID    VOC1   
00310	AR    	VOICED VOWEL  MID    VOC2   
00320	A     	VOICED VOWEL  MID    VOC1   VOC2   
00330	OO    	VOICED VOWEL  BACK   
00340	U     	VOICED VOWEL  BACK   VOC1   
00350	AW    	VOICED VOWEL  BACK   VOC2   
00360	O     	VOICED VOWEL  BACK   VOC1   VOC2   
00370	Y     	VOICED GLIDE  GLI1   NASGLI 
00380	R     	VOICED GLIDE  NASGLI 
00390	L     	VOICED GLIDE  GLI1   GLI2   NASGLI 
00400	W     	VOICED GLIDE  GLI2   NASGLI 
00410	NG    	NASAL  NASGLI 
00420	M     	VOICED NASAL  NAS1   NASGLI 
00430	N     	VOICED NASAL  NAS2   NASGLI 
00440	F     	FRIC   FRIC1  
00450	S     	FRIC   FRIC2  
00460	SH    	FRIC   FRIC3  
00470	H     	FRIC   
00480	V     	VOICED FRIC   VF1    VOIFRI 
00490	Z     	VOICED FRIC   VF2    VOIFRI 
00500	ZH    	VOICED FRIC   VOIFRI 
00510	PB    	FRIC   BURST  FRIC1  
00520	TB    	FRIC   BURST  FRIC2  
00530	KB    	FRIC   BURST  FRIC3  
00540	SI    	STOP   
00550	VS    	VOICED STOP   
00560	
00570		Table 2. List of Phonettes and Associated Features.
     

00010	The following tables exist
00020	
00030	Name	TYPE	Learn	Gate	IN1	IN2	IN3	IN4	IN5	IN6
00040	VOICED	P2    VOICED	0 0LPE	0LPE  	0HPE  	      	      	      	      
00050	VOWEL 	P2    VOWEL 	4 0VOI	0LPE  	0AVE  	      	      	      	      
00060	DUM1  	P2          	0 0   	0     	0     	      	      	      	      
00070	FRIC1 	P2    FRIC  	11 0VO	0LPE  	0HPE  	      	      	      	      
00080	STOP1 	P3    STOP  	11 0VO	0A1   	0A2   	0AVE  	      	      	      
00090	DUM2  	P2          	0 0   	0     	0     	      	      	      	      
00100	DUM3  	P2          	0 0   	0     	0     	      	      	      	      
00110	FRIC  	P2    FRIC  	11 0VO	0FRIC1	0AVE  	      	      	      	      
00120	T5    	P3    FRONT 	4 0VOW	0F1   	0F2   	0F3   	      	      	      
00130	T6    	P3    FRONT 	4 0VOW	0A1   	0A2   	0A3   	      	      	      
00140	T8    	P3    MID   	4 0VOW	0F1   	0F2   	0F3   	      	      	      
00150	T9    	P3    MID   	4 0VOW	0A1   	0A2   	0A3   	      	      	      
00160	T11   	P3    BACK  	4 0VOW	0F1   	0F2   	0F3   	      	      	      
00170	T12   	P3    BACK  	4 0VOW	0A1   	0A2   	0A3   	      	      	      
00180	VOC1  	P3    VOC1  	4 0VOW	0F1   	0F2   	0A2   	      	      	      
00190	VOC2  	P3    VOC2  	4 0VOW	0F1   	0F2   	0A2   	      	      	      
00200	DUM4  	P2          	0 0   	0     	0     	      	      	      	      
00210	DUM5  	P2          	0 0   	0     	0     	      	      	      	      
00220	BURST 	P2    BURST 	3 0FRI	0AVE  	0HPE  	      	      	      	      
00230	FRN   	P2    FRONT 	4 0VOW	0T5   	0T6   	      	      	      	      
00240	MID   	P2    MID   	4 0VOW	0T8   	0T9   	      	      	      	      
00250	BCK   	P2    BACK  	4 0VOW	0T11  	0T12  	      	      	      	      
00260	T17   	P3    FRIC1 	4 0FRI	0FP1  	0FP2  	0FZ   	      	      	      
00270	T18   	P3    FRIC1 	4 0FRI	0FP1A 	0FP2A 	0FZA  	      	      	      
00280	T20   	P3    FRIC2 	4 0FRI	0FP1  	0FP2  	0FZ   	      	      	      
00290	T21   	P3    FRIC2 	4 0FRI	0FP1A 	0FP2A 	0FZA  	      	      	      
00300	T23   	P3    FRIC3 	4 0FRI	0FP1  	0FP2  	0FZ   	      	      	      
00310	T24   	P3    FRIC3 	4 0FRI	0FP1A 	0FP2A 	0FZA  	      	      	      
00320	STOP  	P3    STOP  	11 0VO	0STOP1	0FRIC 	0LPE  	      	      	      
00330	DUM6  	P2          	0 0   	0     	0     	      	      	      	      
00340	DUM7  	P2          	0 0   	0     	0     	      	      	      	      
00350	FRNSS 	P3    FRONT 	4 0VOW	0FRN  	1FRN  	2FRN  	      	      	      
00360	MIDSS 	P3    MID   	4 0VOW	0MID  	1MID  	2MID  	      	      	      
00370	BCKSS 	P3    BACK  	4 0VOW	0BCK  	1BCK  	2BCK  	      	      	      
00380	FR1   	P2    FRIC1 	4 0FRI	0T17  	0T18  	      	      	      	      
00390	FR2   	P2    FRIC2 	4 0FRI	0T20  	0T21  	      	      	      	      
00400	FR3   	P2    FRIC3 	4 0FRI	0T23  	0T24  	      	      	      	      
00410	DUM8  	P2          	0 0   	0     	0     	      	      	      	      
00420	DUM9  	P2          	0 0   	0     	0     	      	      	      	      
00430	FR1SS 	P3    FRIC1 	4 0FRI	0FR1  	1FR1  	1FR2  	      	      	      
00440	FR2SS 	P3    FRIC2 	4 0FRI	0FR2  	1FR2  	2FR2  	      	      	      
00450	FR3SS 	P3    FRIC3 	4 0FRI	0FR3  	1FR3  	2FR3  	      	      	      
00460	DUM10 	P2          	0 0   	0     	0     	      	      	      	      
00470	DUM11 	P2          	0 0   	0     	0     	      	      	      	      
00480	FRVS  	Q3    EEAEE I 	4 0VOW	0FRNSS	0VOC1 	0VOC2 	      	      	      
00490	MIVS  	Q3    ASAAARA 	4 0VOW	0MIDSS	0VOC1 	0VOC2 	      	      	      
00500	BCVS  	Q3    OOU AWO 	4 0VOW	0BCKSS	0VOC1 	0VOC2 	      	      	      
00502	
00504	
00506					(13)         (cont.)
00508	
     

00010	FRES  	Q3    H S F SH	4 0FRI	0FR1SS	0FR2SS	0FR3SS	      	      	      
00020	BRSTS 	Q3    PBTBKBNU	3 0BUR	0FR1SS	0FR2SS	0FR3SS	      	      	      
00030	STOPS 	Q3    VSSINUNU	3 0STO	0AVE  	0F1   	0A1   	      	      	      
00040	DUM12 	P2          	0 0   	0     	0     	      	      	      	      
00050	DUM13 	P2          	0 0   	0     	0     	      	      	      	      
00060	NASA1 	P2    NASGLI	14 0VO	0A1   	0AVE  	      	      	      	      
00070	NASGLI	P2    NASGLI	14 0VO	0NASA1	0LPE  	      	      	      	      
00080	DUM14 	P2          	0 0   	0     	0     	      	      	      	      
00090	T51   	P3    NAS1  	3 0NAS	0F1   	0F2   	0F3   	      	      	      
00100	T61   	P3    NAS1  	3 0NAS	0A1   	0A2   	0A3   	      	      	      
00110	T71   	P3    NAS1  	3 0NAS	0NP   	0NZ   	0NZA  	      	      	      
00120	T91   	P3    NAS2  	3 0NAS	0A1   	0A2   	0A3   	      	      	      
00130	T81   	P3    NAS2  	3 0NAS	0F1   	0F2   	0F3   	      	      	      
00140	T101  	P3    NAS2  	3 0NAS	0NP   	0NZ   	0NZA  	      	      	      
00150	T52   	P3    GLI1  	3 0NAS	0F1   	0F2   	0F3   	      	      	      
00160	T62   	P3    GLI1  	3 0NAS	0A1   	0A2   	0A3   	      	      	      
00170	T112  	P3    GLI2  	3 0NAS	0F1   	0F2   	0F3   	      	      	      
00180	T122  	P3    GLI2  	3 0NAS	0A1   	0A2   	0A3   	      	      	      
00190	DUM15 	P2          	0 0   	0     	0     	      	      	      	      
00200	NAS1  	P3    NAS1  	3 0NAS	0T51  	0T61  	0T71  	      	      	      
00210	NAS2  	P3    NAS2  	3 0NAS	0T81  	0T91  	0T101 	      	      	      
00220	DUM16 	P2          	0 0   	0     	0     	      	      	      	      
00230	GLI1  	P3    GLI1  	3 0NAS	0T52  	0T62  	0NAS1 	      	      	      
00240	GLI2  	P3    GLI2  	3 0NAS	0T112 	0T122 	0NAS2 	      	      	      
00250	NAS1SS	P2    NAS1  	3 0NAS	0NAS1 	1NAS1 	      	      	      	      
00260	NAS2SS	P2    NAS2  	3 0NAS	0NAS2 	1NAS2 	      	      	      	      
00270	DUM17 	P2          	0 0   	0     	0     	      	      	      	      
00280	NASALS	Q3    M N NGNU	3 0NAS	0NP   	0NAS1S	0NAS2S	      	      	      
00290	GLIDES	Q3    W R L Y 	3 0NAS	0F2   	0GLI1 	0GLI2 	      	      	      
00300	DUM18 	P2          	0 0   	0     	0     	      	      	      	      
00310	VOFR  	P2    VOIFRI	0 0LPE	0VOICE	0FRIC 	      	      	      	      
00320	
00330	
00340		Table 3. The Current Signature Tables.
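The rows of Table 3 name their own inputs, so the hierarchy of tables can be traced mechanically. The sketch below is in Python, which is not a language used in this work. It copies four rows from the table and collects the lowest-level parameters that feed FR2SS. It assumes that each input field is a single leading digit followed by a table or parameter name. The meaning of that digit is not given in the table, and the sketch simply discards it. The names ROWS, source_name and primitive_inputs are hypothetical and serve only for illustration.

```python
# Rows copied from Table 3: table name -> input fields exactly as printed.
ROWS = {
    "T20": ["0FP1", "0FP2", "0FZ"],
    "T21": ["0FP1A", "0FP2A", "0FZA"],
    "FR2": ["0T20", "0T21"],
    "FR2SS": ["0FR2", "1FR2", "2FR2"],
}


def source_name(field):
    """Drop the single leading digit (assumption: its meaning is not needed here)."""
    return field[1:]


def primitive_inputs(table, rows=ROWS):
    """Return the set of input names under `table` that are not themselves tables in `rows`."""
    found = set()
    for field in rows[table]:
        name = source_name(field)
        if name in rows:
            # The input is another table: descend into it.
            found |= primitive_inputs(name, rows)
        else:
            # The input is not defined as a table here: treat it as a basic parameter.
            found.add(name)
    return found


print(sorted(primitive_inputs("FR2SS")))
```

With these four rows, FR2SS reduces to the six parameters FP1, FP1A, FP2, FP2A, FZ and FZA. The same walk applies to any other chain in the table once its rows are added to ROWS.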
00350	
00360	
00502	
00504	
00506					(14)
00508	
     

00010		The following counters exist:
00020		
00030		Name	Input	Level	Hysteresis
00040		VOI-C 	0VOICE	5	2
00050		FRI-C 	0FRIC 	4	3
00060		VOWEL 	0VOWEL	4	2
00070		BURST 	0BURST	3	3
00080		STOP  	0STOP 	4	3
00090		NASGLI	0NASGL	4	3
00100		EE    	Q1FRVS	3	3
00110		AE    	Q2FRVS	3	3
00120		E     	Q3FRVS	2	3
00130		I     	Q4FRVS	2	3
00140		AS    	Q1MIVS	2	3
00150		AA    	Q2MIVS	2	3
00160		AR    	Q3MIVS	2	3
00170		A     	Q4MIVS	3	3
00180		OO    	Q1BCVS	4	3
00190		U     	Q2BCVS	2	3
00200		AW    	Q3BCVS	2	3
00210		O     	Q4BCVS	4	3
00220		M     	Q1NASA	4	3
00230		N     	Q2NASA	4	3
00240		NG    	Q3NASA	2	3
00250		S     	Q2FRES	3	3
00260		F     	Q3FRES	4	3
00270		PB    	Q1BRST	3	3
00280		TB    	Q2BRST	4	3
00290		KB    	Q3BRST	2	3
00300		VS    	Q1STOP	2	3
00310		SI    	Q2STOP	4	3
00320		W     	Q1GLID	4	2
00330		R     	Q2GLID	3	3
00340		L     	Q3GLID	2	3
00350		Y     	Q4GLID	2	3
00360		VZ    	0VOFR 	2	3
00370		
00380	
00390		Table 5. List of Counters
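Read against Table 3, a counter input of the form Qn followed by a name appears to select the n-th two-character label of a Q3 table, with the table name cut to four characters (Q1NASA for NASALS, Q1BRST for BRSTS). This is an inference from the two tables as printed, not a rule stated in the text. The sketch below, in Python, checks that reading against every Qn entry of Table 5. The names Q3_LABEL_FIELDS, split_labels and resolve_counter_input are hypothetical.

```python
# Q3 rows of Table 3: table name -> packed 8-character label field, as printed.
Q3_LABEL_FIELDS = {
    "FRVS": "EEAEE I ",
    "MIVS": "ASAAARA ",
    "BCVS": "OOU AWO ",
    "FRES": "H S F SH",
    "BRSTS": "PBTBKBNU",
    "STOPS": "VSSINUNU",
    "NASALS": "M N NGNU",
    "GLIDES": "W R L Y ",
}

# (counter name, input) pairs copied from Table 5; only the Qn entries are listed.
Q_COUNTERS = [
    ("EE", "Q1FRVS"), ("AE", "Q2FRVS"), ("E", "Q3FRVS"), ("I", "Q4FRVS"),
    ("AS", "Q1MIVS"), ("AA", "Q2MIVS"), ("AR", "Q3MIVS"), ("A", "Q4MIVS"),
    ("OO", "Q1BCVS"), ("U", "Q2BCVS"), ("AW", "Q3BCVS"), ("O", "Q4BCVS"),
    ("M", "Q1NASA"), ("N", "Q2NASA"), ("NG", "Q3NASA"),
    ("S", "Q2FRES"), ("F", "Q3FRES"),
    ("PB", "Q1BRST"), ("TB", "Q2BRST"), ("KB", "Q3BRST"),
    ("VS", "Q1STOP"), ("SI", "Q2STOP"),
    ("W", "Q1GLID"), ("R", "Q2GLID"), ("L", "Q3GLID"), ("Y", "Q4GLID"),
]


def split_labels(field):
    """Split an 8-character field into four labels, dropping pad blanks."""
    return [field[i:i + 2].strip() for i in range(0, 8, 2)]


def resolve_counter_input(inp):
    """Map an input such as 'Q2FRES' to (table, label); return None for other inputs."""
    if not (inp.startswith("Q") and inp[1:2].isdigit()):
        return None
    slot = int(inp[1])
    if not 1 <= slot <= 4:
        raise ValueError("slot out of range: " + inp)
    stem = inp[2:].strip()
    # Assumption: the stem is the Q3 table name truncated to fit the field.
    matches = [name for name in Q3_LABEL_FIELDS if name.startswith(stem)]
    if len(matches) != 1:
        raise ValueError("cannot resolve table for " + inp)
    table = matches[0]
    return table, split_labels(Q3_LABEL_FIELDS[table])[slot - 1]


# Counters whose name differs from the label their input selects.
mismatches = [(n, i) for n, i in Q_COUNTERS if resolve_counter_input(i)[1] != n]
print(mismatches)
```

An empty list means every Qn counter carries the name of the label it selects, which supports the reading above. Inputs that begin with 0, such as 0VOICE, are left unresolved by this sketch.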
00400	
00410	
00502	
00504	
00506					(15)
00508